
    Models, Inference, and Implementation for Scalable Probabilistic Models of Text

    Unsupervised probabilistic Bayesian models are powerful tools for statistical analysis, especially in information retrieval, document analysis, and text processing. Despite their success, these models are often slow in inference because their latent variables are entangled and mutually dependent. In addition, the parameter space of these models is usually very large. As data from various media sources (for example, the internet, electronic books, and digital films) become widely accessible, the lack of scalability of these models becomes a critical bottleneck. The primary focus of this dissertation is to speed up the inference process in unsupervised probabilistic Bayesian models. There are two common solutions for scaling an algorithm up to large data: parallelization and streaming. The former achieves scalability by distributing the data and the computation across multiple machines. The latter assumes that data arrive in a stream and updates the model gradually after seeing each observation. It is able to scale to larger datasets because it usually takes only one pass over the entire data. In this dissertation, we examine both approaches. We first demonstrate the effectiveness of the parallelization approach on a class of unsupervised Bayesian models, topic models, which are exemplified by latent Dirichlet allocation (LDA). We propose a fast parallel implementation using variational inference on the MapReduce framework, referred to as Mr. LDA. We show that parallelization enables topic models to handle significantly larger datasets. We further show that our implementation, unlike highly tuned and specialized implementations, is easily extensible. We demonstrate two extensions possible with this scalable framework: 1) informed priors to guide topic discovery and 2) extracting topics from a multilingual corpus.
    We propose polylingual tree-based topic models to infer topics in multilingual corpora. We then propose three different inference methods to infer the latent variables. We examine the effectiveness of the different inference methods on the task of machine translation, in which we use the proposed model to extract domain knowledge that considers both source and target languages. We apply it to a large collection of aligned Chinese-English sentences and show that our model yields a significant improvement in BLEU score over strong baselines. Besides parallelization, another approach to scalability is to learn parameters in an online streaming setting. Although many online algorithms have been proposed for LDA, they all overlook a fundamental but challenging problem: the vocabulary is constantly evolving over time. To address this problem, we propose an online LDA with infinite vocabulary, infvoc LDA. We derive online hybrid inference for our model and propose heuristics to dynamically order, expand, and contract the set of words in our vocabulary. We show that our algorithm is able to discover better topics by incorporating new words into the vocabulary and constantly refining the topics over time. In addition to LDA, we also show the generality of the online hybrid inference framework by applying it to adaptor grammars, which are a broader class of models subsuming LDA. With proper grammar rules, an adaptor grammar simplifies to the exact LDA model; however, it provides more flexibility to alter or extend LDA with different grammar rules. We develop online hybrid inference for adaptor grammars and show that our method discovers high-quality structure more quickly than both MCMC and variational inference methods.
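The contrast between the two scaling strategies described in this abstract can be illustrated with a minimal sketch. It is not LDA inference and is not taken from the dissertation: plain word counts stand in for a model's sufficient statistics, and the names `map_count`, `parallel_counts`, and `StreamingCounts` are hypothetical. The first half follows a map-then-reduce pattern over data shards; the second half makes a single pass over the data and lets the vocabulary grow as new words arrive.

```python
from collections import Counter
from functools import reduce


def map_count(shard):
    """Map step: count words in one shard of documents."""
    counts = Counter()
    for doc in shard:
        counts.update(doc.split())
    return counts


def parallel_counts(shards):
    """Reduce step: merge per-shard counts.

    Each shard could be handled by a separate machine; here they are
    processed sequentially to keep the sketch self-contained.
    """
    return reduce(lambda a, b: a + b, (map_count(s) for s in shards), Counter())


class StreamingCounts:
    """One-pass alternative: update the statistics after each document."""

    def __init__(self):
        self.counts = Counter()

    def update(self, doc):
        # Words never seen before simply extend the vocabulary.
        self.counts.update(doc.split())

    @property
    def vocabulary(self):
        return set(self.counts)


# Toy corpus (illustrative only).
docs = ["topic models scale", "online topic models", "new words appear online"]
shards = [docs[:2], docs[2:]]

batch = parallel_counts(shards)

stream = StreamingCounts()
for d in docs:
    stream.update(d)
```

For simple counts both routes give the same statistics; in an actual topic model the streaming route only approximates a full batch fit, which is the trade-off the abstract describes.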

    Predicting Brain Age Based on Spatial and Temporal Features of Human Brain Functional Networks

    The organization of human brain networks can be measured by capturing correlated brain activity with functional MRI data. A variety of studies have shown that human functional connectivity (FC) undergoes age-related change over development. In the present study, we employed resting-state functional MRI data to construct functional network models. Principal component analysis was performed on the FC matrices across all subjects to identify meaningful components, especially those correlated with age. Coefficients across the components, edge features obtained with a newly proposed feature reduction method, and temporal features based on fALFF were extracted as predictor variables, and three different regression models were trained to predict brain age. We observed that an individual's functional network architecture was shaped by an intrinsic component, an age-related component, and other components, and that the predictive models extracted sufficient information to provide comparatively accurate predictions of brain age.
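The pipeline outlined in this abstract (PCA on FC features, then regression on the component coefficients) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the data are synthetic, the sizes and age range are made up, ordinary least squares stands in for the three unspecified regression models, and the fALFF features and the proposed feature reduction method are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and age range; the abstract does not state them.
n_subjects, n_regions = 60, 10
ages = rng.uniform(8, 30, size=n_subjects)

# Each subject's symmetric FC matrix is represented by its upper-triangle edges.
n_edges = np.triu_indices(n_regions, k=1)[0].size

# Synthetic edges: shared baseline + age-related pattern + noise.
baseline = rng.normal(0.0, 0.3, n_edges)
age_pattern = rng.normal(0.0, 0.02, n_edges)
X = baseline + np.outer(ages, age_pattern) + rng.normal(0.0, 0.05, (n_subjects, n_edges))


def fit_pca(X, n_components):
    """Return the feature mean and the leading principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Coefficients of each subject on the principal axes."""
    return (X - mean) @ components.T


train, test = np.arange(0, 40), np.arange(40, 60)

# Fit PCA on training subjects only, so test subjects do not leak into it.
mean, comps = fit_pca(X[train], n_components=5)
Z_train = project(X[train], mean, comps)
Z_test = project(X[test], mean, comps)

# Correlation of each component's coefficients with age (training set).
corr = np.array([np.corrcoef(Z_train[:, k], ages[train])[0, 1] for k in range(5)])

# Ordinary least squares with an intercept, as a stand-in regression model.
A_train = np.column_stack([np.ones(len(train)), Z_train])
coef, *_ = np.linalg.lstsq(A_train, ages[train], rcond=None)
pred = np.column_stack([np.ones(len(test)), Z_test]) @ coef

mae = float(np.mean(np.abs(pred - ages[test])))  # mean absolute error in years
```

Inspecting `corr` shows which components track age, which mirrors the abstract's split into an age-related component and other components.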

    Discovering latent structure in task-oriented dialogues.

    A key challenge for computational conversation models is to discover latent structure in task-oriented dialogue, since it provides a basis for analysing, evaluating, and building conversational systems. We propose three new unsupervised models to discover latent structures in task-oriented dialogues. Our methods synthesize hidden Markov models (for underlying state) and topic models (to connect words to states). We apply them to two real, non-trivial datasets: human-computer spoken dialogues from a bus query service, and human-human text-based chats from a live technical support service. We show that our models extract meaningful state representations and dialogue structures consistent with human annotations. Quantitatively, we show that our models achieve superior performance on held-out log-likelihood evaluation and an ordering task.

    A Dual-Band Microwave Filter Design for Modern Wireless Communication Systems

    Modern communication systems rely on high-performance devices to improve communication quality in support of a high quality of life and smart-city systems. As a crucial signal-processing step, a microwave filter removes unwanted frequency components from the received signal and enhances the useful ones. However, large loss, bulky size, and single-band operation greatly limit practical applications in urban computing. Therefore, filters with dual-band characteristics are highly desirable for modern wireless communication, such as device-to-device communication, environmental monitoring, and automatic driving. In this paper, a dual-band microwave filter is designed and fabricated based on the theory of Mie-resonance extraordinary transmission. An electromagnetic wave cannot propagate through a subwavelength aperture drilled in a metallic film. By adding two dielectric cuboids of different sizes into the two apertures, two passbands appear in the frequency range of 10.0-12.0 GHz. In this range, the insertion loss is less than 0.4 dB, and the 3-dB bandwidth is more than 48 MHz. In particular, the two passband frequencies can be tuned by adjusting the size of the dielectric cuboids. This approach opens a way to design tunable dual-band microwave bandpass filters, which benefits spectrum resource utilization.
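To put the reported figures in perspective, the short sketch below converts the stated decibel values into power fractions and a fractional bandwidth. The 0.4 dB and 48 MHz values come from the abstract. The 11.0 GHz centre frequency is an assumption (the midpoint of the stated 10.0-12.0 GHz range), since the abstract does not give the individual passband frequencies, and the function names are hypothetical.

```python
def db_to_power_fraction(loss_db):
    """Fraction of input power transmitted for a given loss in dB."""
    return 10 ** (-loss_db / 10)


def fractional_bandwidth(bandwidth_hz, center_hz):
    """Bandwidth relative to the centre frequency (dimensionless)."""
    return bandwidth_hz / center_hz


insertion_loss_db = 0.4   # upper bound reported in the abstract
bandwidth_hz = 48e6       # lower bound on the 3-dB bandwidth from the abstract
center_hz = 11.0e9        # ASSUMED: midpoint of the 10.0-12.0 GHz range

transmitted = db_to_power_fraction(insertion_loss_db)
half_power = db_to_power_fraction(3.0)   # level that defines the 3-dB bandwidth
fbw = fractional_bandwidth(bandwidth_hz, center_hz)
```

An insertion loss below 0.4 dB means that more than about 91% of the in-band power is transmitted, and the 3-dB bandwidth is measured between the points where the transmitted power falls to roughly half of its peak value.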

    Aging Response and Precipitation Behavior After 5% Pre-Deformation of an Al-Mg-Si-Cu Alloy

    In this study, an Al-1.00 Mg-0.65 Si-0.24 Cu alloy was solution heat-treated, water-quenched, and then pre-deformed by 5% before aging. The peak hardness and yield strength of the pre-deformed sample with subsequent artificial aging were similar to those of a T6 condition sample. It was also found that the pre-deformation treatment could inhibit the negative influence of natural aging to some degree. After seven days of natural aging, the pre-deformed sample obtained better peak hardness and yield strength upon artificial aging than the sample without pre-deformation. In addition, the pre-deformation treatment could reduce the artificial aging time needed to reach the peak-aged condition by 50% compared with T6 treatment. For the peak-aged condition in the pre-deformed sample, transmission electron microscopy (TEM) revealed two types of precipitates along the dislocations, in addition to the β″ precipitates in the Al matrix. Both precipitates had disordered atomic arrangements on the ordered subcell (Si network). The disordered precipitates consumed a number of Mg and Si atoms, resulting in fewer β″ precipitates forming during artificial aging at 180 °C.